feature evolution
A Neural Collapse Perspective on Feature Evolution in Graph Neural Networks
Graph neural networks (GNNs) have become increasingly popular for classification tasks on graph-structured data. Yet, the interplay between graph topology and feature evolution in GNNs is not well understood. In this paper, we focus on node-wise classification, illustrated with community detection on stochastic block model graphs, and explore the feature evolution through the lens of the Neural Collapse (NC) phenomenon. When training instance-wise deep classifiers (e.g., for image classification) beyond the zero training error point, NC demonstrates a reduction in the deepest features' within-class variability and an increased alignment of their class means to certain symmetric structures. We start with an empirical study showing that a decrease in within-class variability is also prevalent in the node-wise classification setting, though not to the extent observed in the instance-wise case. Then, we theoretically study this distinction. Specifically, we show that even an optimistic mathematical model requires the graphs to obey a strict structural condition in order to possess a minimizer with exact collapse. Furthermore, by studying the gradient dynamics of this model, we provide reasoning for the partial collapse observed empirically. Finally, we present a study of the evolution of within- and between-class feature variability across the layers of a well-trained GNN and contrast the behavior with that of spectral methods.
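The within-class variability discussed here is commonly quantified in the NC literature by the NC1 metric of Papyan et al., tr(Σ_W Σ_B†)/K, where Σ_W and Σ_B are the within- and between-class covariance matrices of the features. Below is a minimal NumPy sketch of that metric with a two-community toy example; the data and dimensions are purely illustrative, not the paper's setup.

```python
import numpy as np

def nc1_variability(features, labels):
    """NC1 metric from the Neural Collapse literature:
    trace(Sigma_W @ pinv(Sigma_B)) / K, where Sigma_W / Sigma_B are the
    within- / between-class covariance matrices. Smaller means more collapse."""
    classes = np.unique(labels)
    K, (n, d) = len(classes), features.shape
    mu_g = features.mean(axis=0)
    sigma_w = np.zeros((d, d))
    sigma_b = np.zeros((d, d))
    for c in classes:
        fc = features[labels == c]
        centered = fc - fc.mean(axis=0)
        sigma_w += centered.T @ centered / n
        diff = (fc.mean(axis=0) - mu_g)[:, None]
        sigma_b += diff @ diff.T / K
    return np.trace(sigma_w @ np.linalg.pinv(sigma_b)) / K

# Toy example: node features from two hypothetical SBM communities.
rng = np.random.default_rng(0)
H = np.vstack([rng.normal(0.0, 1.0, (50, 8)), rng.normal(3.0, 1.0, (50, 8))])
y = np.array([0] * 50 + [1] * 50)
print(nc1_variability(H, y))  # approaches 0 under exact collapse
```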
Evolution of Concepts in Language Model Pre-Training
Xuyang Ge, Wentao Shu, Jiaxing Wu, Yunhua Zhou, Zhengfu He, Xipeng Qiu
Language models obtain extensive capabilities through pre-training. However, the pre-training process remains a black box. In this work, we track the evolution of linear, interpretable features across pre-training snapshots using a sparse dictionary learning method called crosscoders. We find that most features begin to form around a specific point, while more complex patterns emerge in later training stages. Feature attribution analyses reveal causal connections between feature evolution and downstream performance. Our feature-level observations are highly consistent with previous findings on the Transformer's two-stage learning process, which we term a statistical learning phase and a feature learning phase. Our work opens up the possibility of tracking fine-grained representation progress during language model learning dynamics.
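A crosscoder extends a sparse autoencoder to several models at once: one shared sparse latent space with a per-snapshot encoder/decoder pair, trained to reconstruct every checkpoint's activations jointly, so a given latent unit keeps the same identity across snapshots. The PyTorch sketch below is a minimal reading of that idea, with illustrative dimensions and a plain L1 sparsity penalty; the paper's exact architecture and loss may differ.

```python
import torch
import torch.nn as nn

class Crosscoder(nn.Module):
    """One shared sparse latent space with per-snapshot encoders/decoders,
    trained to reconstruct activations from several checkpoints jointly."""
    def __init__(self, n_snapshots, d_model, d_latent):
        super().__init__()
        self.enc = nn.ModuleList(nn.Linear(d_model, d_latent) for _ in range(n_snapshots))
        self.dec = nn.ModuleList(nn.Linear(d_latent, d_model) for _ in range(n_snapshots))

    def forward(self, acts):
        # acts: list of (batch, d_model) activation tensors, one per snapshot.
        z = torch.relu(sum(enc(a) for enc, a in zip(self.enc, acts)))
        return z, [dec(z) for dec in self.dec]

def crosscoder_loss(z, recons, acts, l1=1e-3):
    # Joint reconstruction over all snapshots plus an L1 sparsity penalty.
    recon = sum(((r - a) ** 2).sum(-1).mean() for r, a in zip(recons, acts))
    return recon + l1 * z.abs().sum(-1).mean()
```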
Tracking the Feature Dynamics in LLM Training: A Mechanistic Study
Understanding training dynamics and feature evolution is crucial for the mechanistic interpretability of large language models (LLMs). Although sparse autoencoders (SAEs) have been used to identify features within LLMs, a clear picture of how these features evolve during training remains elusive. In this study, we: (1) introduce SAE-Track, a method to efficiently obtain a continual series of SAEs; (2) formulate the process of feature formation and conduct a mechanistic analysis; and (3) analyze and visualize feature drift during training. Our work provides new insights into the dynamics of features in LLMs, enhancing our understanding of training mechanisms and feature evolution.
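The abstract does not spell out how the "continual series of SAEs" is obtained, so the sketch below makes one plausible assumption: each checkpoint's SAE is warm-started from the previous checkpoint's weights, which keeps the learned dictionary aligned across snapshots and makes individual features trackable. All names and hyperparameters are hypothetical.

```python
import torch
import torch.nn as nn

class SAE(nn.Module):
    """Standard sparse autoencoder over model activations."""
    def __init__(self, d_model, d_latent):
        super().__init__()
        self.enc = nn.Linear(d_model, d_latent)
        self.dec = nn.Linear(d_latent, d_model)

    def forward(self, x):
        z = torch.relu(self.enc(x))
        return z, self.dec(z)

def continual_saes(activation_loaders, d_model, d_latent, l1=1e-3, steps=1000):
    """Assumed warm-start scheme: the SAE for checkpoint t is initialized from
    checkpoint t-1's SAE, so feature identity can be tracked across snapshots."""
    saes, sae = [], SAE(d_model, d_latent)
    for loader in activation_loaders:  # one activation loader per snapshot
        opt = torch.optim.Adam(sae.parameters(), lr=1e-4)
        for _, x in zip(range(steps), loader):
            z, x_hat = sae(x)
            loss = ((x_hat - x) ** 2).sum(-1).mean() + l1 * z.abs().sum(-1).mean()
            opt.zero_grad(); loss.backward(); opt.step()
        saes.append(sae)
        sae = SAE(d_model, d_latent)
        sae.load_state_dict(saes[-1].state_dict())  # warm start for next snapshot
    return saes
```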
Counterfactual Augmentation for Multimodal Learning Under Presentation Bias
Victoria Lin, Louis-Philippe Morency, Dimitrios Dimitriadis, Srinagesh Sharma
In real-world machine learning systems, labels are often derived from user behaviors that the system wishes to encourage. Over time, new models must be trained as new training examples and features become available. However, feedback loops between users and models can bias future user behavior, inducing a presentation bias in the labels that compromises the ability to train new models. In this paper, we propose counterfactual augmentation, a novel causal method for correcting presentation bias using generated counterfactual labels. Our empirical evaluations demonstrate that counterfactual augmentation yields better downstream performance compared to both uncorrected models and existing bias-correction methods. Model analyses further indicate that the generated counterfactuals align closely with true counterfactuals in an oracle setting.
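The abstract gives no implementation detail, so the following is a deliberately naive impute-then-retrain sketch of the general idea: labels exist only for items the old model presented, a label model generates counterfactual labels for unpresented items, and the new model trains on both. All data and names are hypothetical, and the paper's causal correction is more principled than this plain pseudo-labeling.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical setup: `presented` marks examples whose labels were observed
# under the old model's feedback loop (the presentation bias).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
true_y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
presented = X[:, 0] > 0                 # biased exposure: labels only here
y_obs = np.where(presented, true_y, -1)  # -1 = label never observed

# Step 1: fit a label model on the observed (biased) slice.
label_model = LogisticRegression().fit(X[presented], y_obs[presented])

# Step 2: counterfactual augmentation -- generate labels for unpresented
# items, then train the new model on observed + counterfactual labels.
y_aug = y_obs.copy()
y_aug[~presented] = label_model.predict(X[~presented])
new_model = LogisticRegression().fit(X, y_aug)
```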
Rapid Feature Evolution Accelerates Learning in Neural Networks
Neural network (NN) training and generalization in the infinite-width limit are well-characterized by kernel methods with a neural tangent kernel (NTK) that is stationary in time. However, finite-width NNs consistently outperform corresponding kernel methods, suggesting the importance of feature learning, which manifests as the time evolution of NTKs. Here, we analyze the phenomenon of kernel alignment of the NTK with the target functions during gradient descent. We first provide a mechanistic explanation for why alignment between task and kernel occurs in deep linear networks. We then show that this behavior occurs more generally if one optimizes the feature map over time to accelerate learning while constraining how quickly the features evolve. Empirically, gradient descent undergoes a feature learning phase, during which the top eigenfunctions of the NTK quickly align with the target function and the loss decreases faster than a power law in time; it then enters a kernel gradient descent (KGD) phase where the alignment does not improve significantly and the training loss decreases as a power law. We show that feature evolution is faster and more dramatic in deeper networks. We also find that networks with multiple output nodes develop separate, specialized kernels for each output channel, a phenomenon we term kernel specialization. We show that this class-specific alignment does not occur in linear networks.
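The alignment in question is standardly measured as kernel-target alignment, A(K, yyᵀ) = ⟨K, yyᵀ⟩_F / (‖K‖_F ‖yyᵀ‖_F). A minimal NumPy sketch follows; the RBF kernel toy example is illustrative only, not the paper's empirical-NTK setup.

```python
import numpy as np

def kernel_target_alignment(K, y):
    """Kernel-target alignment A(K, yy^T) = <K, yy^T>_F / (||K||_F ||yy^T||_F).
    Rising alignment of the empirical NTK with the target over training is the
    signature of the feature learning phase described above."""
    yyT = np.outer(y, y)
    return (K * yyT).sum() / (np.linalg.norm(K) * np.linalg.norm(yyT))

# Toy example with an RBF kernel on 1-D inputs (illustrative only).
x = np.linspace(-1, 1, 100)
y = np.sign(x)
K = np.exp(-(x[:, None] - x[None, :]) ** 2 / 0.1)
print(kernel_target_alignment(K, y))
```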
Evolving Metric Learning for Incremental and Decremental Features
Jiahua Dong, Yang Cong, Gan Sun, Tao Zhang, Xiaowei Xu
Online metric learning has been widely exploited for large-scale data classification due to its low computational cost. However, in practical online scenarios where the features are evolving (e.g., some features vanish and some new features are augmented), most metric learning models cannot be applied successfully, even though they can handle evolving instances efficiently. To address this challenge, we propose a new online Evolving Metric Learning (EML) model for incremental and decremental features, which can handle instance and feature evolution simultaneously by incorporating a smoothed Wasserstein metric distance. Specifically, our model contains two essential stages: the Transforming stage (T-stage) and the Inheriting stage (I-stage). In the T-stage, we propose to extract important information from vanished features while neglecting non-informative knowledge, and forward it into the survived features by transforming them into a low-rank discriminative metric space. This further exploits the intrinsic low-rank structure of heterogeneous samples to reduce the computational and memory burden, especially for high-dimensional large-scale data. In the I-stage, we inherit the metric performance of the survived features from the T-stage and then expand it to include the augmented new features. Moreover, the smoothed Wasserstein distance is utilized to characterize the similarity relations among the complex and heterogeneous data, since the evolving features in the different stages are not strictly aligned. In addition to tackling the challenges in the one-shot case, we also extend our model to the multi-shot scenario. After deriving an efficient optimization method for both the T-stage and the I-stage, extensive experiments on several benchmark datasets verify the superiority of our model.
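As a loose illustration of the I-stage idea only (ignoring the smoothed Wasserstein distance and the low-rank T-stage transform; all names here are invented), one can picture inheriting a learned Mahalanobis metric over the survived features and growing it block-wise to cover newly augmented dimensions:

```python
import numpy as np

def expand_metric(M_survived, n_new, init_scale=1.0):
    """Simplified I-stage-style expansion: keep the inherited metric over
    survived features and append an identity block for new features."""
    d = M_survived.shape[0]
    M = np.eye(d + n_new) * init_scale
    M[:d, :d] = M_survived
    return M

def mahalanobis(x1, x2, M):
    # Distance under the (positive semi-definite) metric matrix M.
    diff = x1 - x2
    return float(np.sqrt(diff @ M @ diff))
```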